SEQUENCE LISTING

(1) GENERAL INFORMATION:

(i) APPLICANTS: Aaron Kaplan et al.

(ii) TITLE OF INVENTION: ENHANCING INORGANIC CARBON FIXATION BY PHOTOSYNTHETIC ORGANISMS

(iii) NUMBER OF SEQUENCES: 9

(iv) CORRESPONDENCE ADDRESS:
(A) ADDRESSEE: Mark M. Friedman c/o Anthony Castorina
(B) STREET: 2001 Jefferson Davis Highway, Suite 207
(C) CITY: Arlington
(D) STATE: Virginia
(E) COUNTRY: United States of America
(F) ZIP: 22202

(v) COMPUTER READABLE FORM:
(A) MEDIUM TYPE: 1.44 megabyte, 3.5" microdisk
(B) COMPUTER: Twinhead Slimnote-890TX
(C) OPERATING SYSTEM: MS DOS version 6.2, Windows version 3.11
(D) SOFTWARE: Word for Windows version 2.0 converted to an ASCII file

(vi) CURRENT APPLICATION DATA:
(A) APPLICATION NUMBER:
(B) FILING DATE:
(C) CLASSIFICATION:

(vii) PRIOR APPLICATION DATA:
(A) APPLICATION NUMBER:
(B) FILING DATE:

(viii) ATTORNEY/AGENT INFORMATION:
(A) NAME: Friedman, Mark M.
(B) REGISTRATION NUMBER: 33,883
(C) REFERENCE/DOCKET NUMBER: 325/45

(ix) TELECOMMUNICATION INFORMATION:
(A) TELEPHONE: 972-3-5625553
(B) TELEFAX: 972-3-5625554
(C) TELEX:

(2) INFORMATION FOR SEQ ID NO: 1:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 4957

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 1:
AAGCTTGGAT TGAAGCGATC GGGGTCAATC CCAGCGATGA TCCTCAGTTC 50
CTCCTGATGG TCGATCCCTT TAGCGCCAAG ATTGAGGATC TGCTGCAAGG 100
GCTGGATTTC GCCTATCCCG AGGCCGTGAA AGTGGGCGGA TTGGCCAGTG 150
GTTTGGGGGC AGAGTCAGCG ATCGCCAGCT TGTTTTTTCA AGACCGACAG 200
GTCGATGGCG TGATTGGGCT AGCCCTCAGT GGCAATGTCC AGCTGCAGGC 250
GATCGTGGCT CAGGGCTGTC GTCCAGTTGG CCCGCTTTGG CATGTGGCAG 300
CGGCGGAGCG CAACATTCTG CGGCAACTTC AGACCGAAGA CGAGGAACCG 350
ATCGCCGCGC TGCAAGCCCT ACAGTCAGTC CTGCGTGATC TCTCCCCTGA 400
ATTACAGCGA TCGCTCTGTG TGGGCCTGGC CTGCAATTCT TTCCAAACGG 450
TATTACAACC GGGCGACTTC CTGATCCGTA ACCTGCTGGG GTTTGATCCC 500
CGCACTGGTG CTGTAGCAAT CGGCGATCGC ATTCGAGTTG GGCAGCGGCT 550
GCAGCTGCAC GTACGGGATG CCCAGACAGC GGCGGATGAC CTCGAGCGGC 600
AACTGGGGCA ATGGTGCCGG CAGCATGCGA CAAAACCAGC AGCTTCCCTC 650
TTGTTTTCCT GCTTGGGGCG CGGCAAGCCC TTCTATCRGC AGGCCAACTT 700
CGAGTCGCAA CTGATTCAGC ATTACCTCTC AGAGCTGCCC CTAGCTGGCT 750
TTTTCTGTAA TGGCGAAATC GGCCCGATCG CTGGCAGCAC CTACCTGCAT 800 
GGCTACACAT CGGTGCTGGC TTTGCTGTCG GCCAAAACTC ACTAGCGCCA 850 
GCGAGACCTG ATTGTCGATC TGCTGAGCGC GACTGTAGCG CTGGAAATAG 900 
GCCCGGACCT GAGCAGGCGC ATCGGCCAAG CTGACCGTAG TATCACCGTC 950 
AGCCACCCCC GCCCAGAAAT TCCGCAACAT CGGCAGGAGA GCGATCGCCT 1000 
CCGCCTCCGA TAAATTCAAC GGCTCATGGG TCAACAGGCG GATCAAGTAC 1050 
TCTGACTGCG ATCGCCATCC ATTCCCGCCG AAAACGTTTG TAAATCAGTC 1100 
TTGATCCGGT AGCGATCGCA CCCGACGGGA CTCTAGTTCT AGTTGCCAAC 1150 
CTTCAGCGGC AGGTTGTACG GTTCCGAGTC GGTAGGGATG GGGATAGCTG 1200 
ACCAAGGAAC CGGTCGTGAC TTCCCAGAGA GCACCTTGCT GACTGGTGGC 1250 
TTGGATGTGG AGGTGGCCTG TGAAGATCAC CGAGACGCTG CCCGCTTCGA 1300 
GGATTGATCG CAATTCCTCG GCATTTTCTA AGATGTAGCG CTGACCAAGC 1350 
GGATGCTGCT GTTGATCGGG CAGATGCTCC AACACATTGT GGTGAATCAT 1400 
CACCCAGCGT TGGCTAGCGG TGGAAGTGGC GAGTTCTTGT TGCAGCCAGT 1450 
TGAGTTGCGC GCAATCGACT CGCCCCCGAT GCAGTTGATG GCCCGCTTCA 1500 
TCAAAAGCGA TCGAATTCAG CGCAAACAGA TCGAGATCCG GTGCGATCGT 1550 
GCAGCGATAG TAGGGGCGAT CGCTCGTGAA GCCAAAGTCT TGATAGAGCT 1600 
CGACAAACTC GGCCACACCG GTGCGATCGC GATCGCTCGC TGCGGCGGGC 1650 
ATATCGTGGT TGCCCGGCAC CACATAGACC GGATAGGGCA ACTGGCGCAA 1700 
TTGTTGCAGC AGCCACTGAT GGTTTTCCCG CTCCCCGTGC TGGGTTAAAT 1750 
CCCCCGGCAG CAACAGGAAG TCCAAATCCA GCGCTGCCAG TTCTGTCAGG 1800 
ATTTGCTCAA AAGCCGGAAT GCTGCACTCA ATCAAATGGA AGCGATGGGG 1850 
ATGGTGCCAA ATTGTCTGCG GCAGTCCAAT GTGGAGATCG CTCAGCAGCG 1900 
CAAATCGAAA CGCTCGGTTC ATTGCCATCC CCTCAGCTAT CGAGCCCGAT 1950 
TCTAGGCGAA GCTAGGTCGA GTCCGTTGTC TTCAGTTGCA AGCATTCATG 2000 
GCCAGAGTTC GCGTTCGGCA GCACGTCAAT CCGCTCTCTC AGAAATTCCA 2050 
AGTGGTCACG ACTTGGCCGG ATTGGCAACA GGTCTATGCG GACTGCGATC 2100 
GCCCGCTGCA TTTGGATATT GGCTGTGCTC GCGGGCGCTT TCTGCTGGCA 2150 
ATGGCGACAC GACAACCTGA GTGGAATTAT CTGGGGCTGG AAATTCGTGA 2200 
GCCGCTGGTA GATGAGGCGA ACGCGATCGC CCGCGAACGT GAACTGACCA 2250 
ATCTCTACTA CCACTTCAGC AACGCCAATT TGGACTTGGA ACCGCTGCTG 2300 
CGATCGCTGC CGACAGGGAT TTTGCAGCGG GTCAGCATTC AGTTCCCGGA 2350 
TCCTTGGTTC AAGAAACGCC ATCAAAAGCG ACGCGTCGTC CAGCCGGAAC 2400 
TGGTGCAAGC CCTCGCGACT GCGTTACCTG CTGGTGCAGA GGTCTTTCTG 2450 
CAATCCGATG TGCTGGAAGT GCAGGCAGAG ATGTGCGAAC ACTTTGCGGC 2500 
GGAACCCCGC TTTCAGCGCA CCTGCTTGGA CTGGCTGCCG GAAAATCCGC 2550 
TGCCCGTCCC GACCGAGCGC GAAATTGCCG TTCAAAACAA ACAGTTGCCA 2600
GTCTACCGTG CTCTCTTCAT TCGGCAGCCA GCGGACTAAG CTCTTAAGGC 2650
AAGCGTTGAC GCGATCGCGA TGACTGTCTG GCAAACTCTG ACTTTTGCCC 2700 
ATTACCAACC CCAACAGTGG GGCCACAGCA GTTTCTTGCA TCGGCTGTTT 2750
GGCAGCCTGC GAGCTTGGCG GGCCTCCAGC CAGCTGTTGG TTTGGTCTGA 2800 
GGCACTGGGT GGCTTCTTGC TTGCTGTCGT CTACGGTTCG GCTCCGTTTG 2850 
TGCCCAGTTC CGCCCTAGGG TTGGGGCTAG CCGCGATCGC GGCCTATTGG 2900 
GCCCTGCTCT CGCTGACAGA TATCGATCTG CGGCAAGCAA CCCCCATTCA 2950
CTGGCTGGTG CTGCTCTACT GGGGCGTCGA TGCCCTAGCA ACGGGACTCT 3000 
CACCCGTACG CGCTGCAGCT TTAGTTGGGC TAGCCAAACT GACGCTCTAC 3050 
CTGTTGGTTT TTGCCCTAGC GGCTCGGGTT CTCCGCAATC CCCGTCTGCG 3100 
ATCGCTGCTG TTCTCGGTCG TCGTGATCAC ATCGCTTTTT GTCAGTGTCT 3150 
ACGGCCTCAA CCAATGGATC TACGGCGTTG AAGAGCTGGC GACTTGGGTG 3200
GATCGCAACT CGGTTGCCGA CTTCACCTCA CGGGTTTACA GCTATCTGGG 3250 
CAACCCCAAC CTGCTGGCTG CTTATCTGGT GCCGACGACT GCCTTTTCTG 3300 
CAGCAGCGAT CGGGGTGTGG CGCGGCTGGC TCCCCAAGCT GCTGGCGATC 3350 
GCTGCGRCAG GTGCGAGCAG CTTATGTCTG ATCCTCACCT ACAGTCGCGG 3400 
TGGCTGGCTG GGTTTTGTCG CCATGATTTT TGTCTGGGCG TTATTAGGGC 3450 
TCTACTGGTT TCAACCCCGT CTACCCGCAC CCTGGCGACG CTGGCTATTC 3500 
CCAGTCGTAT TGGGTGGACT AGTCGCGGTG CTCTTGGTGG CGGTGCTTGG 3550
ACTTGAGCCG TTGCGCGTGC GCGTGTTGAG CATCTTTGTG GGGCGTGAAG 3600
ACAGCAGCAA CAACTTCCGG ATCAATGTCT GGCTGGCGGT GCTGCAGATG 3650
ATTCAAGATC GGCCTTGGCT GGGCATCGGC CCCGGCAATA CCGCCTTTAA 3700
CCTGGTTTAT CCCCTCTATC AACAGGCGCG CTTTACGGCG TTGAGCGCCT 3750
ACTCCGTCCC GCTGGAAGTC GCGGTTGAGG GCGGACTACT GGGCTTGACG 3800
GCCTTCGCTT GGCTGCTGCT GGTCACGGCG GTGACGGCGG TGCGGCAGGT 3850
GAGCCGACTG CGGCGCGATC GCAATCCCCA AGCCTTTTGG TTGATGGCTA 3900
GCTTGGCCGG TTTGGCAGGA ATGCTGGGTC ACGGTCTGTT TGATACCGTG 3950
CTCTATCGAC CGGAAGCCAG TACGCTCTGG TGGCTCTGTA TTGGAGCGAT 4000
CGCGAGTTTC TGGCAGCCCC AACCTTCCAA GCAACTCCCT CCAGAAGCCG 4050
AGCATTCAGA CGAAAAAATG TAGCGGGCTC CCCAACAAAT TCCTGTGCAC 4100
CCGACTGGAT CCACCACCTA AACTGGATCC CAAAGGTATC CGGTGGATCT 4150
AGGGTCATAA CGAACTCCGA CCGCGATCGC GTCCGCGAAC TGAACCTCCA 4200
TCGCACCGAA GCGGAGTTCG TTAGTCGTTG AAGAGCCAAT GCTAGAGGGG 4250
GCTGCCGAAG CAGTTGGGCT GGAAGCAGGC TGCGAGAAGC CACCCGCATC 4300
CT^GGCAAAG TTCAGCCGAC CTTCCGCAAA GACTACGATC GCCACGGCGG 4350
CTCTGCCAGC TAAGTCAGCG CTGGGTTAGT TGTCATAGCA GTCCGCAGAC 4400
AAGTTAGGAC AACTTCATAG AGGGACTCGC TCAGAGTCAA CAGCCGCTGT 4450
CCGTGGGGGT GCGCAATCAC CCCCACACCC ACGCACTGGG GGACTCGACT 4500
CCCCCAGGCC CCCCGCAACA AGATTTCGGA TAAGGGGCAT CGGCTGAATC 4550
GCGATCGCTG CGGGTAAAAC TAGCCGGTGT TAGCCATGGG TTTGAGACTA 4600
ATCGGCACGG GGCAAAACGT CCTGATTTAT TTGCTCAATG TGATAGGTTA 4650
CATCGTCAAA AACAAGGCCC AAGAGGTAGG AAAAATCACG ACCGCCCAAG 4700
TCCGAGGGCT TTGCTGTTGG GAGCGACCTA GGGCAGACTA GACAGAGCAT 4750
TGCTGTGAGC CAAAGCGCCT TCAATTGCTG GCGGCTGTGG GTTTTTCGGA 4800
GGTTGCCAAA TGAAAGACCT TTTCGTCAAT GTCCTCCGCT ATCCCCGCTA 4850
CTTCATCACC TTCCAGCTGG GTATTTTTTA GTCGATCTAC CAGTGGGTGC 4900
GGCCGATGGT TCGCAACCCA GTCGCGGCTT GGGCGCTGCT AGGCTTTGGA 4950
GTTTCGA 4957



(2) INFORMATION FOR SEQ ID NO: 2:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1404

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2:
ATGACTGTCT GGCAAACTCT GACTTTTGCC CATTACCAAC CCCAACAGTG 50

GGGCCACAGC AGTTTCTTGC ATCGGCTGTT TGGCAGCCTG CGAGCTTGGC 100

GGGCCTCCAG CCAGCTGTTG GTTTGGTCTG AGGCACTGGG TGGCTTCTTG 150

CTTGCTGTCG TCTACGGTTC GGCTCCGTTT GTGCCCAGTT CCGCCCTAGG 200

GTTGGGGCTA GCCGCGATCG CGGCCTATTG GGCCCTGCTC TCGCTGACAG 250

ATATCGATCT GCGGCAAGCA ACCCCCATTC ACTGGCTGGT GCTGCTCTAC 300

TGGGGCGTCG ATGCCCTAGC AACGGGACTC TCACCCGTAC GCGCTGCAGC 350

TTTAGTTGGG CTAGCCAAAC TGACGCTCTA CCTGTTGGTT TTTGCCCTAG 400

CGGCTCGGGT TCTCCGCAAT CCCCGTCTGC GATCGCTGCT GTTCTCGGTC 450

GTCGTGATCA CATCGCTTTT TGTCAGTGTC TACGGCCTCA ACCAATGGAT 500

CTACGGCGTT GAAGAGCTGG CGACTTGGGT GGATCGCAAC TCGGTTGCCG 550

ACTTCACCTC ACGGGTTTAC AGCTATCTGG GCAACCCCAA CCTGCTGGCT 600

GCTTATCTGG TGCCGACGAC TGCCTTTTCT GCAGCAGCGA TCGGGGTGTG 650

GCGCGGCTGG CTCCCCAAGC TGCTGGCGAT CGCTGCGACA GGTGCGAGCA 700

GCTTATGTCT GATCCTCACC TACAGTCGCG GTGGCTGGCT GGGTTTTGTC 750

GCCATGATTT TTGTCTGGGC GTTATTAGGG CTCTACTGGT TTCAACCCCG 800

TCTACCCGCA CCCTGGCGAC GCTGGCTATT CCCAGTCGTA TTGGGTGGAC 850

TAGTCGCGGT GCTCTTGGTG GCGGTGCTTG GACTTGAGCC GTTGCGCGTG 900

CGCGTGTTGA GCATCTTTGT GGGGCGTGAA GACAGCAGCA ACAACTTCCG 950

GATCAATGTC TGGCTGGCGG TGCTGCAGAT GATTCAAGAT CGGCCTTGGC 1000

TGGGCATCGG CCCCGGCAAT ACCGCCTTTA ACCTGGTTTA TCCCCTCTAT 1050 

CAACAGGCGC GCTTTACGGC GTTGAGCGCC TACTCCGTCC CGCTGGAAGT 1100 

CGCGGTTGAG GGCGGACTAC TGGGCTTGAC GGCCTTCGCT TGGCTGCTGC 1150 

TGGTCACGGC GGTGACGGCG GTGCGGCAGG TGAGCCGACT GCGGCGCGAT 1200 

CGCAATCCCC AAGCCTTTTG GTTGATGGCT AGCTTGGCCG GTTTGGCAGG 1250 

AATGCTGGGT CACGGTCTGT TTGATACCGT GCTCTATCGA CCGGAAGCCA 1300 

GTACGCTCTG GTGGCTCTGT ATTGGAGCGA TCGCGAGTTT CTGGCAGCCC 1350 

CAACCTTCCA AGCAACTCCC TCCAGAAGCC GAGCATTCAG ACGAAAAAAT 1400 

GTAG 1404 

(2) INFORMATION FOR SEQ ID NO: 3: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 467

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3:
Met Thr Val Trp Gln Thr Leu Thr Phe Ala His Tyr Gln Pro Gln
5 10 15

Gln Trp Gly His Ser Ser Phe Leu His Arg Leu Phe Gly Ser Leu
20 25 30

Arg Ala Trp Arg Ala Ser Ser Gln Leu Leu Val Trp Ser Glu Ala
35 40 45

Leu Gly Gly Phe Leu Leu Ala Val Val Tyr Gly Ser Ala Pro Phe
50 55 60

Val Pro Ser Ser Ala Leu Gly Leu Gly Leu Ala Ala Ile Ala Ala
65 70 75

Tyr Trp Ala Leu Leu Ser Leu Thr Asp Ile Asp Leu Arg Gln Ala
80 85 90

Thr Pro Ile His Trp Leu Val Leu Leu Tyr Trp Gly Val Asp Ala
95 100 105

Leu Ala Thr Gly Leu Ser Pro Val Arg Ala Ala Ala Leu Val Gly
110 115 120

Leu Ala Lys Leu Thr Leu Tyr Leu Leu Val Phe Ala Leu Ala Ala
125 130 135

Arg Val Leu Arg Asn Pro Arg Leu Arg Ser Leu Leu Phe Ser Val
140 145 150

Val Val Ile Thr Ser Leu Phe Val Ser Val Tyr Gly Leu Asn Gln
155 160 165

Trp Ile Tyr Gly Val Glu Glu Leu Ala Thr Trp Val Asp Arg Asn
170 175 180

Ser Val Ala Asp Phe Thr Ser Arg Val Tyr Ser Tyr Leu Gly Asn
185 190 195

Pro Asn Leu Leu Ala Ala Tyr Leu Val Pro Thr Thr Ala Phe Ser
200 205 210

Ala Ala Ala Ile Gly Val Trp Arg Gly Trp Leu Pro Lys Leu Leu
215 220 225

Ala Ile Ala Ala Thr Gly Ala Ser Ser Leu Cys Leu Ile Leu Thr
230 235 240

Tyr Ser Arg Gly Gly Trp Leu Gly Phe Val Ala Met Ile Phe Val
245 250 255

Trp Ala Leu Leu Gly Leu Tyr Trp Phe Gln Pro Arg Leu Pro Ala
260 265 270

Pro Trp Arg Arg Trp Leu Phe Pro Val Val Leu Gly Gly Leu Val
275 280 285

Ala Val Leu Leu Val Ala Val Leu Gly Leu Glu Pro Leu Arg Val
290 295 300

Arg Val Leu Ser Ile Phe Val Gly Arg Glu Asp Ser Ser Asn Asn
305 310 315

Phe Arg Ile Asn Val Trp Leu Ala Val Leu Gln Met Ile Gln Asp
320 325 330

Arg Pro Trp Leu Gly Ile Gly Pro Gly Asn Thr Ala Phe Asn Leu
335 340 345

Val Tyr Pro Leu Tyr Gln Gln Ala Arg Phe Thr Ala Leu Ser Ala
350 355 360

Tyr Ser Val Pro Leu Glu Val Ala Val Glu Gly Gly Leu Leu Gly
365 370 375

Leu Thr Ala Phe Ala Trp Leu Leu Leu Val Thr Ala Val Thr Ala
380 385 390

Val Arg Gln Val Ser Arg Leu Arg Arg Asp Arg Asn Pro Gln Ala
395 400 405

Phe Trp Leu Met Ala Ser Leu Ala Gly Leu Ala Gly Met Leu Gly
410 415 420

His Gly Leu Phe Asp Thr Val Leu Tyr Arg Pro Glu Ala Ser Thr
425 430 435

Leu Trp Trp Leu Cys Ile Gly Ala Ile Ala Ser Phe Trp Gln Pro
440 445 450

Gln Pro Ser Lys Gln Leu Pro Pro Glu Ala Glu His Ser Asp Glu
455 460 465

Lys Met

(2) INFORMATION FOR SEQ ID NO: 4: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1425 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4: 
ATGGTGTCTC CCATCTCTAT CTGGCGATCG CTGATGTTTG GCGGTTTTTC 50 
CCCCCAGGAA TGGGGCCGGG GCAGTGTGCT CCATCGTTTG GTGGGCTGGG 100 
GACAGAGTTG GATACAGGCT AGTGTGCTCT GGCCCCACTT CGAGGCATTG 150 
GGTACGGCTC TAGTGGCAAT AATTTTTATT GCGGCTCCCT TCACCTCCAC 200 
CACCATGTTG GGCATTTTTA TGCTGCTCTG TGGAGCCTTT TGGGCTCTGC 250 
TGACCTTTGC TGATCAACCA GGGAAGGGTT TGACTCCCAT CCATGTTTTA 300 
GTTTTTGCCT ACTGGTGCAT TTCGGCGATC GCCGTGGGAT TTTCTCCGGT 350 
AAAAATGGCG GCGGCGTCGG GGTTAGCGAA ATTAACAGCT AATTTATGTC 400
TGTTTCTACT GGCGGCGAGG TTATTGCAAA ACAAACAATG GTTGAACCGG 450
TTAGTAACCG TTGTTTTACT GGTAGGGCTA TTGGTGGGGA GTTACGGTCT 500 
GCGACAACAG GTGGACGGGG TAGAACAGTT AGCCACTTGG AATGACCCCA 550 
CCTCTACCTT GGCCCAGGCC ACTAGGGTAT ATAGCTTTTT AGGTAATCCC 600 
AATCTCTTGG CGGCTTACCT GGTGCCCATG ACGGGTTTGA GCTTGAGTGC 650 
CCTGGTGGTA TGGCGACGGT GGTGGCCCAA ACTGCTGGGA GCAACCATGG 700 
TGATTGTTAA CCTACTCTGT CTCTTTTTTA CCCAGAGCCG GGGCGGTTGG 750 
CTAGCAGTGC TGGCCCTGGG AGCTACCTTC CTGGCCCTTT GTTACTTCTG 800 
GTGGTTACCC CAATTACCCA AATTTTGGCA ACGGTGGTCT TTGCCCCTGG 850 
CGATCGCCGT GGCGGTTATA TTAGGTGGGG GAGCGTTGAT TGCGGTGGAA 900 
CCGATTCGAC TCAGGGCCAT GAGCATTTTT GCTGGGCGGG AAGACAGCAG 950 
TAATAATTTC CGCATCAATG TTTGGGAAGG GGTAAAAGCC ATGATCCGAG 1000 
CCCGCCCTAT CATTGGCATT GGCCCAGGTA ACGAAGCCTT TAACCAAATT 1050 
TATCCTTACT ATATGCGGCC CCGCTTCACC GCCCTGAGTG CCTATTCCAT 1100 
TTACCTAGAA ATTTTGGTGG AAACGGGTGT AGTTGGTTTT ACCTGTATGC 1150 
TCTGGCTGTT GGCCGTTACC CTAGGCAAAG GCGTAGAACT GGTTAAACGC 1200 
TGTCGCCAAA CCCTCGCCCC GGAAGGCATC TGGATTATGG GGGCTTTAGC 1250 
GGCGATCATC GGTTTGTTGG TCCACGGCAT GGTAGATACA GTCTGGTACC 1300
GTCCCCCGGT GAGCACTTTG TGGTGGTTGC TAGTGGCCAT TGTTGCTAGT 1350
CAGTGGGCCA GCGCCCAGGC CCGTTTGGAG GCCAGTAAAG AAGAAAATGA 1400
GGACAAACCT CTTCTTGCTT CATAA 1425

(2) INFORMATION FOR SEQ ID NO: 5: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 474 

(B) TYPE: amino acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5: 
Met Val Ser Pro Ile Ser Ile Trp Arg Ser Leu Met Phe Gly Gly
5 10 15

Phe Ser Pro Gln Glu Trp Gly Arg Gly Ser Val Leu His Arg Leu
20 25 30

Val Gly Trp Gly Gln Ser Trp Ile Gln Ala Ser Val Leu Trp Pro
35 40 45

His Phe Glu Ala Leu Gly Thr Ala Leu Val Ala Ile Ile Phe Ile
50 55 60

Ala Ala Pro Phe Thr Ser Thr Thr Met Leu Gly Ile Phe Met Leu
65 70 75

Leu Cys Gly Ala Phe Trp Ala Leu Leu Thr Phe Ala Asp Gln Pro
80 85 90

Gly Lys Gly Leu Thr Pro Ile His Val Leu Val Phe Ala Tyr Trp
95 100 105

Cys Ile Ser Ala Ile Ala Val Gly Phe Ser Pro Val Lys Met Ala
110 115 120

Ala Ala Ser Gly Leu Ala Lys Leu Thr Ala Asn Leu Cys Leu Phe
125 130 135

Leu Leu Ala Ala Arg Leu Leu Gln Asn Lys Gln Trp Leu Asn Arg
140 145 150

Leu Val Thr Val Val Leu Leu Val Gly Leu Leu Val Gly Ser Tyr
155 160 165

Gly Leu Arg Gln Gln Val Asp Gly Val Glu Gln Leu Ala Thr Trp
170 175 180

Asn Asp Pro Thr Ser Thr Leu Ala Gln Ala Thr Arg Val Tyr Ser
185 190 195

Phe Leu Gly Asn Pro Asn Leu Leu Ala Ala Tyr Leu Val Pro Met
200 205 210

Thr Gly Leu Ser Leu Ser Ala Leu Val Val Trp Arg Arg Trp Trp
215 220 225

Pro Lys Leu Leu Gly Ala Thr Met Val Ile Val Asn Leu Leu Cys
230 235 240

Leu Phe Phe Thr Gln Ser Arg Gly Gly Trp Leu Ala Val Leu Ala
245 250 255

Leu Gly Ala Thr Phe Leu Ala Leu Cys Tyr Phe Trp Trp Leu Pro
260 265 270

Gln Leu Pro Lys Phe Trp Gln Arg Trp Ser Leu Pro Leu Ala Ile
275 280 285

Ala Val Ala Val Ile Leu Gly Gly Gly Ala Leu Ile Ala Val Glu
290 295 300

Pro Ile Arg Leu Arg Ala Met Ser Ile Phe Ala Gly Arg Glu Asp
305 310 315

Ser Ser Asn Asn Phe Arg Ile Asn Val Trp Glu Gly Val Lys Ala
320 325 330

Met Ile Arg Ala Arg Pro Ile Ile Gly Ile Gly Pro Gly Asn Glu
335 340 345

Ala Phe Asn Gln Ile Tyr Pro Tyr Tyr Met Arg Pro Arg Phe Thr
350 355 360

Ala Leu Ser Ala Tyr Ser Ile Tyr Leu Glu Ile Leu Val Glu Thr
365 370 375

Gly Val Val Gly Phe Thr Cys Met Leu Trp Leu Leu Ala Val Thr
380 385 390

Leu Gly Lys Gly Val Glu Leu Val Lys Arg Cys Arg Gln Thr Leu
395 400 405

Ala Pro Glu Gly Ile Trp Ile Met Gly Ala Leu Ala Ala Ile Ile
410 415 420

Gly Leu Leu Val His Gly Met Val Asp Thr Val Trp Tyr Arg Pro
425 430 435

Pro Val Ser Thr Leu Trp Trp Leu Leu Val Ala Ile Val Ala Ser
440 445 450

Gln Trp Ala Ser Ala Gln Ala Arg Leu Glu Ala Ser Lys Glu Glu
455 460 465

Asn Glu Asp Lys Pro Leu Leu Ala Ser
470



(2) INFORMATION FOR SEQ ID NO: 6:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 31

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6:
GGGCTAGCCG CGATCGCGGC CTATTGGGCC C 31



(2) INFORMATION FOR SEQ ID NO: 7:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 27

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7:
GGGCTAGGGA TCGCGCCTAT TGGGCCC 27



(2) INFORMATION FOR SEQ ID NO: 8:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 26

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8:
GGGCTCAGAT CGCGCCTATT GGGCCC 26



(2) INFORMATION FOR SEQ ID NO: 9:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 11

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9:
Gly Leu Ala Ala Ile Ala Ala Tyr Trp Ala Leu
5 10



